Abstract

The node set of a two-mode network consists of two disjoint subsets,
and every link connects a node from one subset with a node from the other.
The links can be weighted. We developed a new method for identifying
important subnetworks in two-mode networks. The method combines and extends
the ideas of generalized cores in one-mode networks and of $(p, q)$-cores
in two-mode networks. In this paper we introduce the notion
of generalized two-mode cores and discuss some of their properties.

An efficient algorithm for determining generalized two-mode cores and
an analysis of its complexity are also presented. For illustration, some
results obtained in analyses of real-life data are presented.
1 Introduction

Network analysis is an approach to the analysis of relational data. In this
paper we deal with the analysis of two-mode networks. A two-mode
network is a network in which the set of nodes consists of two disjoint subsets
and every link connects a node from one subset with a node from the other.

The traditional approach to the analysis of two-mode networks is usually
indirect: first a two-mode network is converted into one of the two
corresponding one-mode projections, and afterward the projection is analyzed
using standard network analysis methods [3]. Direct methods for the analysis
of two-mode networks are quite rare. We can use bipartite statistics on degrees,
generalized blockmodeling, $(p, q)$-cores, two-mode hubs and authorities,
4-ring weights, bi-communities, two-mode clustering, bipartite cores, and some
others. Many methods for identifying important subnetworks are available
for one-mode networks (measures of centrality and importance, generalized
cores, line islands, node islands, clustering, blockmodeling, etc.). We present
a new direct method which can be used for the identification of important
subnetworks in two-mode networks with respect to selected node properties.

We combine the ideas from generalized cores in one-mode networks and
from $(p, q)$-cores for two-mode networks into the notion of generalized
two-mode cores. We developed and implemented an algorithm for identifying
generalized two-mode cores for selected node properties and given thresholds
for both subsets of nodes. We also propose an algorithm to find the nested
generalized two-mode cores for one fixed threshold value.

In the next section we survey the works that contain the ideas we used
for the development of our method. In Section 3 we present an algorithm
for identifying the generalized two-mode cores. We list some node properties
that are used as measures of importance. We also present some properties
of generalized two-mode cores. We prove that for equivalent properties
measured in ordinal scales the sets of generalized two-mode cores are the same.
The algorithm, the proof of its correctness, and a simple analysis of its
complexity are presented in Section 4. In Section 5 some results obtained in
analyses of real-life data are presented.


2 Related work 

The notion of a $k$-core was introduced by Seidman (1983) [7]. Let
$\mathcal{G} = (V, \mathcal{L})$ be a graph with $n = |V|$ nodes and
$m = |\mathcal{L}|$ links. Let $k$ be a fixed integer and let $\deg(v)$ be
the degree of a node $v \in V$. A subgraph
$\mathcal{H}_k = (C_k, \mathcal{L}|C_k)$ induced by the subset
$C_k \subseteq V$ is called a $k$-core iff $\deg_{\mathcal{H}_k}(v) \ge k$
for all $v \in C_k$, and $\mathcal{H}_k$ is the maximal such subgraph. If we
replace the degree with some other node property, we get the notion of
generalized cores as it was introduced in [8]. The node property can be the
node degree, the maximum of incident link weights, the sum of incident link
weights, etc. Such properties are described in more detail in Section 3.1.
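
The $k$-core definition above can be illustrated with a minimal Python sketch that repeatedly deletes nodes of degree below $k$. The function name `k_core`, the adjacency-dictionary representation, and the toy graph are assumptions made for this illustration; they are not taken from the cited works.

```python
def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph.

    adj maps every node to the set of its neighbors (hypothetical layout).
    """
    core = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    # Nodes whose degree is already too small are scheduled for deletion.
    stack = [v for v in core if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in core:
            continue  # v was scheduled twice and is already deleted
        core.remove(v)
        for u in adj[v]:
            if u in core:
                degree[u] -= 1
                if degree[u] < k:
                    stack.append(u)
    return core


# Toy graph: a triangle a-b-c with a pendant node d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
two_core = k_core(graph, 2)
three_core = k_core(graph, 3)
```

In this toy graph the pendant node is the only node outside the 2-core, and no node survives for $k = 3$.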

The other possible generalization of $k$-cores is their extension to two-mode
networks. The notion of $(p, q)$-cores was introduced in [9]. A subset
$C \subseteq V$ determines a $(p, q)$-core in a two-mode network
$\mathcal{N} = ((V_1, V_2), \mathcal{L})$, $V = V_1 \cup V_2$, iff in the
subnetwork $\mathcal{K} = ((C_1, C_2), \mathcal{L}|C)$, $C_1 = C \cap V_1$,
$C_2 = C \cap V_2$, induced by $C$ it holds that for all $v \in C_1$:
$\deg_{\mathcal{K}}(v) \ge p$ and for all $v \in C_2$:
$\deg_{\mathcal{K}}(v) \ge q$, and $C$ is the maximal such subset in $V$.

We combined the ideas from generalized cores and $(p, q)$-cores into the
notion of generalized two-mode cores. Generalized two-mode cores are
defined similarly to $(p, q)$-cores with one exception: instead of using only
the degree of nodes, we also allow other properties of nodes. The properties
of nodes on the subsets $V_1$ and $V_2$ can be different. The detailed
definition is given in Section 3.1.


Other types of two-mode subnetworks were discussed in the literature. A
bipartite core is defined as a complete two-mode subnetwork. Bipartite
cores are determined by the size of each subset of vertices.

Similar methods are varieties of community detection in two-mode
networks: with maximization of a monotone function, with a dual projection
[12], with properties of the eigenspectrum of the network's matrix [13], with
label propagation and recursive division of the two types of nodes, with
stochastic block modeling, and many others. Another very similar
method is bipartite clustering. The substantial difference is that
these methods determine a clustering of the whole set of nodes, whereas our
method determines only an important subset.

The generalized two-mode cores depend on selected node properties that
express different aspects of the network structure (for example, the
intensity of links). They also use different criteria. Therefore our
method represents a new approach to two-mode network analysis. It is
not an improvement of any existing method, but a generalization
of $(p, q)$-cores.


3 Algorithms for generalized two-mode cores 

As mentioned in Section 1, algorithms for identifying $k$-cores, generalized
cores, and $(p, q)$-cores have already been developed [8, 9]. We propose a
new algorithm, which combines and extends the ideas from generalized cores
in one-mode networks and from $(p, q)$-cores in two-mode networks. Besides
implementing the new algorithm, we also prove its correctness and analyze
its complexity. To test the usefulness of the method we applied it to real
networks.

3.1 Properties of nodes 

For a network $\mathcal{N} = (V, \mathcal{L}, w)$ with a weight function $w$
defined on the links $\mathcal{L}$, a property function
$f(v, C) \in \mathbb{R}_0^+$ is defined for all $v \in V$ and
$C \subseteq V$. The subset $C$ induces the subnetwork to which the
evaluation of the property function is limited. In an undirected network it
holds that $w(u, v) = w(v, u)$ for all pairs of nodes $u, v \in V$.

Let us denote the neighborhood of a node $v$ as $N(v)$ and the neighborhood
of a node $v$ within the subset $C$ as $N(v, C) = N(v) \cap C$. The
neighborhood of a node $v$ within the subset $C$ including $v$ itself we
denote as $N^+(v, C) = N(v, C) \cup \{v\}$. Let us also denote a measurement
on nodes (degree, centrality, etc.) as $t : V \to \mathbb{R}_0^+$.

We say that the property function $f(v, C)$ is local iff

$$f(v, C) = f(v, N(v, C)) \quad \text{for all } v \in V \text{ and } C \subseteq V.$$

The property function $f(v, C)$ is monotonic iff

$$C_1 \subseteq C_2 \Rightarrow \forall v \in V : f(v, C_1) \le f(v, C_2).$$


Some node properties were proposed in [8]. Examples of property functions
are listed in Table 1.

All the listed functions have the property $f(v, \emptyset) = 0$ for all
$v \in V$.

It can easily be verified that all the listed property functions are local and
monotonic. An example of a non-monotonic function would be the average
weight

$$f(v, C) = \frac{1}{\deg_C(v)} \sum_{u \in N(v, C)} w(v, u)$$

for $N(v, C) \ne \emptyset$, otherwise $f(v, C) = 0$. An example of a
non-local function is the number of cycles or closed walks of length $k$,
$k \ge 4$, through a node.
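
A few of these property functions can be written down directly. The sketch below assumes a hypothetical three-node weighted network stored as a nested dictionary, and the helper names are chosen for this illustration only. It evaluates the degree, the sum of weights, and the maximum weight on a smaller and a larger subset, and shows that the average weight can decrease when the subset grows.

```python
# Hypothetical weighted network: node -> {neighbor: link weight}.
adj = {
    "v": {"a": 4.0, "b": 1.0},
    "a": {"v": 4.0},
    "b": {"v": 1.0},
}


def neighbors(v, C):
    """N(v, C): neighbors of v that lie in the subset C."""
    return [u for u in adj[v] if u in C]


def deg(v, C):
    """Degree of v within C."""
    return len(neighbors(v, C))


def wdeg(v, C):
    """Sum of the weights of the links from v to nodes in C."""
    return sum(adj[v][u] for u in neighbors(v, C))


def mweight(v, C):
    """Maximum weight of the links from v to nodes in C (0 if there are none)."""
    return max((adj[v][u] for u in neighbors(v, C)), default=0)


def avg_weight(v, C):
    """Average link weight within C: local, but not monotonic."""
    d = deg(v, C)
    return wdeg(v, C) / d if d else 0


small, large = {"v", "a"}, {"v", "a", "b"}
# The first three functions do not decrease when the subset grows.
monotonic_ok = all(h("v", small) <= h("v", large) for h in (deg, wdeg, mweight))
# The average weight drops once the light link to b is included.
avg_small, avg_large = avg_weight("v", small), avg_weight("v", large)
```

Here the average weight of `v` falls from 4.0 to 2.5 when `b` joins the subset, which is exactly the failure of monotonicity described above.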


3.2 Generalized two-mode cores 

Definition 3.1. Let $\mathcal{N} = ((V_1, V_2), \mathcal{L}, (f, g), w)$,
$V = V_1 \cup V_2$, be a finite two-mode network, that is, the sets $V$ and
$\mathcal{L}$ are finite. Let $\mathcal{P}(V)$ be the power set of the set
$V$. Let the functions $f$ and $g$ be defined on the network $\mathcal{N}$:
$f, g : V \times \mathcal{P}(V) \to \mathbb{R}_0^+$.

A subset of nodes $C \subseteq V$ in a two-mode network $\mathcal{N}$ is a
generalized two-mode core $C = \mathrm{Core}(p, q; f, g)$,
$p, q \in \mathbb{R}_0^+$, if and only if in the subnetwork
$\mathcal{K} = ((C_1, C_2), \mathcal{L}|C)$, $C_1 = C \cap V_1$,
$C_2 = C \cap V_2$, induced by $C$ it holds that for all $v \in C_1$:
$f(v, C) \ge p$ and for all $v \in C_2$: $g(v, C) \ge q$, and $C$ is the
maximal such subset in $V$.

When the functions $f$ and $g$ are clear from context and the parameters $p$
and $q$ are fixed, we use the abbreviation
$\mathrm{Core}(p, q) = \mathrm{Core}(p, q; f, g)$.

Table 1: Examples of property functions.

Formula | Description
$f_1(v, C) = \deg_C(v)$ | Degree of a node $v$ within $C$.
$f_2(v, C) = \mathrm{indeg}_C(v)$ | Input degree of a node $v$ within $C$.
$f_3(v, C) = \mathrm{outdeg}_C(v)$ | Output degree of a node $v$ within $C$.
$f_4(v, C) = \mathrm{indeg}_C(v) + \mathrm{outdeg}_C(v)$ | For a directed network $f_4 = f_1$.
$f_5(v, C) = \mathrm{wdeg}_C(v) = \sum_{u \in N(v, C)} w(v, u)$ | Sum of weights of links within $C$ that have a node $v$ for an end node.
$f_6(v, C) = \mathrm{mweight}_C(v) = \max_{u \in N(v, C)} w(v, u)$ | Maximum of weights of all links within $C$ that have a node $v$ for an end node.
$f_7(v, C) = \mathrm{pdeg}_C(v) = \deg_C(v) / \deg(v)$ if $\deg(v) > 0$, else $f_7(v, C) = 0$ | Proportion of $N(v, C)$ in $N(v)$.
$f_8(v, C) = \mathrm{density}_C(v)$ if $\deg(v) > 0$, else $f_8(v, C) = 0$ | Relative density of the neighborhood of a node $v$ within $C$.
$f_9(v, C) = \mathrm{degrange}_C(v) = \max_{u \in N(v, C)} \deg(u) - \min_{u \in N(v, C)} \deg(u)$ | Range of degrees of neighbors of a node $v$ within $C$ with respect to their degrees.
$f_{10}(v, C) = \mathrm{tdegrange}_C(v) = \max_{u \in N^+(v, C)} \deg(u) - \min_{u \in N^+(v, C)} \deg(u)$ | Total range of degrees of neighbors of a node $v$.
$f_{11}(v, C) = \mathrm{pweight}_C(v) = \sum_{u \in N(v, C)} w(v, u) / \sum_{u \in N(v)} w(v, u)$ if $\sum_{u \in N(v)} w(v, u) > 0$, else $f_{11}(v, C) = 0$ | Proportion of weights of links with a node $v$ as an end node that have the other end node within $C$.
$f_{12}(v, C) = \mathrm{triangles}_C(v)$ | Number of triangles through a node $v$ with all three nodes in $C$.
$f_{13}(v, C) = \mathrm{sum}_C(v, t) = \sum_{u \in N(v, C)} t(u)$ |
$f_{14}(v, C) = \mathrm{max}_C(v, t) = \max_{u \in N(v, C)} t(u)$ |


The set of all generalized two-mode cores for a network
$\mathcal{N} = ((V_1, V_2), \mathcal{L}, (f, g), w)$ is defined as

$$\mathrm{Cores}(\mathcal{N}) = \{\mathrm{Core}(p, q; f, g) : p \in \mathbb{R}_0^+ \land q \in \mathbb{R}_0^+\}.$$

In this set we can define a relation

$$\mathrm{Core}(p, q; f, g) \sqsubseteq \mathrm{Core}(p', q'; f, g) \iff C_1(p, q; f, g) \subseteq C_1(p', q'; f, g) \land C_2(p, q; f, g) \subseteq C_2(p', q'; f, g).$$

Because $\mathrm{Cores}(\mathcal{N}) \subseteq \mathcal{P}(V)$ and
$(\mathcal{P}(V), \subseteq)$ is a partially ordered set, also
$(\mathrm{Cores}(\mathcal{N}), \subseteq)$ and
$(\mathrm{Cores}(\mathcal{N}), \sqsubseteq)$ are partially ordered.

For the generalized two-mode cores we have: 

• $\mathrm{Core}(0, 0) = V$,

• the subnetwork $\mathcal{K} = ((C_1, C_2), \mathcal{L}|C)$,
$C = \mathrm{Core}(p, q; f, g)$, is not necessarily connected.

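Lemma 3.1. Let $\mathcal{N} = ((V_1, V_2), \mathcal{L}, (f, g), w)$ and its
"mirror" $\mathcal{N}' = ((V_2, V_1), \mathcal{L}, (g, f), w)$ be two-mode
networks. It holds

$$\mathrm{Core}_{\mathcal{N}}(p, q; f, g) = \mathrm{Core}_{\mathcal{N}'}(q, p; g, f).$$

Lemma 3.2. Let $F \subseteq \mathbb{R}$ be the set of values of the property
$f$ (the codomain of the property function $f$) and
$\varphi : F \to \mathbb{R}_0^+$ a strictly increasing function. Then

$$\mathrm{Core}(p, q; f, g) = \mathrm{Core}(\varphi(p), q; \varphi \circ f, g).$$

Corollary 3.1. Let $F \subseteq \mathbb{R}$ and
$\varphi : F \to \mathbb{R}_0^+$ be as in Lemma 3.2. Then

$$\mathrm{Core}(q, p; g, f) = \mathrm{Core}(q, \varphi(p); g, \varphi \circ f).$$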

The proofs of Lemmas 3.1 and 3.2 and Corollary 3.1 are simple. Lemma
3.2 and Corollary 3.1 tell us that equivalent properties measured in ordinal
scales produce the same generalized two-mode cores.
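
The statement of Lemma 3.2 can be checked numerically on a small example. The sketch below is only an illustration under assumptions: the five-node network, the function names, and the choice $\varphi(x) = x^2$ (strictly increasing on nonnegative values) are hypothetical, and the core is computed by naive repeated deletion.

```python
# Hypothetical weighted two-mode network: journals j1, j2 and authors a1..a3.
links = {("j1", "a1"): 3, ("j1", "a2"): 1, ("j2", "a2"): 1, ("j2", "a3"): 1}
V1 = {"j1", "j2"}
V2 = {"a1", "a2", "a3"}
adj = {v: {} for v in V1 | V2}
for (j, a), w in links.items():
    adj[j][a] = w
    adj[a][j] = w


def wdeg(v, C):
    """Sum of link weights from v to nodes in C."""
    return sum(w for u, w in adj[v].items() if u in C)


def deg(v, C):
    """Degree of v within C."""
    return sum(1 for u in adj[v] if u in C)


def core(f, g, p, q):
    """Naive generalized two-mode core: delete violating nodes until none is left."""
    C = V1 | V2
    changed = True
    while changed:
        changed = False
        for v in sorted(C):
            violates = (f(v, C) < p) if v in V1 else (g(v, C) < q)
            if violates:
                C.remove(v)
                changed = True
    return C


def phi(x):
    """Strictly increasing on nonnegative numbers."""
    return x * x


def phi_f(v, C):
    """The transformed property function, phi composed with wdeg."""
    return phi(wdeg(v, C))


# Compare Core(p, q; f, g) with Core(phi(p), q; phi o f, g) on a grid of thresholds.
same = all(
    core(wdeg, deg, p, q) == core(phi_f, deg, phi(p), q)
    for p in range(6)
    for q in range(4)
)
```

On this grid of thresholds the two cores coincide for every pair, as the lemma predicts.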


3.3 Boundary for threshold values 

Theorem 3.1. In a two-mode network
$\mathcal{N} = ((V_1, V_2), \mathcal{L}, (f, g), w)$ with monotonic functions
$f$ and $g$ it holds:

$$(p_1 \le p_2 \land q_1 \le q_2) \Rightarrow \mathrm{Core}(p_2, q_2) \subseteq \mathrm{Core}(p_1, q_1).$$

Proof: By the definition of a generalized two-mode core
$C = \mathrm{Core}(p_2, q_2)$ it holds:

$$\forall v \in C_1 : f(v, C) \ge p_2 \quad \text{and} \quad \forall v \in C_2 : g(v, C) \ge q_2$$

and $C$ is the maximal such subset of nodes. Because $f$ and $g$ are
monotonic and we have $p_1 \le p_2$ and $q_1 \le q_2$ it also holds:

$$\forall v \in C_1 : f(v, C) \ge p_1 \quad \text{and} \quad \forall v \in C_2 : g(v, C) \ge q_1.$$

But $p_1 \le p_2$ and $q_1 \le q_2$, so $C$ is not necessarily the maximal
subset of nodes that defines $\mathrm{Core}(p_1, q_1)$. Therefore:

$$\mathrm{Core}(p_2, q_2) = C \subseteq \mathrm{Core}(p_1, q_1).$$


For a given subset $C \subseteq V$ let $p(C) = \min_{v \in C_1} f(v, C)$ be
the minimum property value in the set $C_1 = C \cap V_1$ and
$q(C) = \min_{v \in C_2} g(v, C)$ the minimum property value in the set
$C_2 = C \cap V_2$. It holds $C \subseteq \mathrm{Core}(p(C), q(C))$.

For a given two-mode network $\mathcal{N} = (V, \mathcal{L}, (f, g), w)$ let
$P = \{p(C) : C \subseteq V\}$ be the set of all possible values of $p(C)$
and $Q = \{q(C) : C \subseteq V\}$ the set of all possible values of $q(C)$.
$P$ and $Q$ are finite sets. Therefore we can enumerate their elements:

$$P = \{p_1, p_2, \ldots, p_r\}, \quad p_i < p_{i+1},$$
$$Q = \{q_1, q_2, \ldots, q_s\}, \quad q_i < q_{i+1}.$$

For fixed functions $f$ and $g$ we are interested only in those $(p, q)$
pairs that determine different nonempty generalized two-mode cores.

It is clear from the condition $p, q \ge 0$ that if we look at the Cartesian
coordinate system with $(p, q)$ axes (Fig. 1), the region of all possible pairs
of thresholds $p$ and $q$ is limited to the first quadrant. The missing
boundary of this region is a broken line in the shape of stairs.

For $p, q \in \mathbb{R}_0^+$ let
$\Pi(p) = \{C : C \subseteq V \land p(C) \ge p\}$ be the set of sets
for which the property value of the first subset is at least equal to $p$, and
$\Gamma(q) = \{C : C \subseteq V \land q(C) \ge q\}$ be the set of sets for
which the property value of the second subset is at least equal to $q$. Let
$G(p) = \{q(C) : C \in \Pi(p)\}$ be the set of all possible values $q(C)$
for which the set $C$ belongs to the set $\Pi(p)$, and let
$q_\Pi(p) = \max G(p)$ be the maximum such value $q(C)$. Let
$H(q) = \{p(C) : C \in \Gamma(q)\}$ be the set of all possible values $p(C)$
for which the set $C$ belongs to the set $\Gamma(q)$, and let
$p_\Gamma(q) = \max H(q)$ be the maximum such value $p(C)$.

Figure 1: The region of all possible threshold pairs $(p, q)$ that determine
generalized two-mode cores.

Lemma 3.3. The following properties hold for $p, p' \in P$ and
$q, q' \in Q$:

1. In the set $\Gamma(q_\Pi(p))$ there exists at least one set $C$ for which
it holds $p(C) = p_0$ and $q(C) = q_\Pi(p_0)$. Therefore:

$$\Gamma(q_\Pi(p)) \ne \emptyset.$$

Similarly: $\Pi(p_\Gamma(q)) \ne \emptyset$.

2. It holds $C \subseteq \mathrm{Core}(p(C), q(C))$ and in all sets in
$\Gamma(q_\Pi(p))$ the property values for the second subset are at least
equal to $q_\Pi(p)$. It also holds:

$$C \in \Gamma(q_\Pi(p)) \Rightarrow C \subseteq \mathrm{Core}(p, q_\Pi(p)).$$

Similarly: $C \in \Pi(p_\Gamma(q)) \Rightarrow C \subseteq \mathrm{Core}(p_\Gamma(q), q)$.

3. $q_\Pi(p)$ is the maximum $q$ for which a nonempty core
$\mathrm{Core}(p, q)$ exists:

$$q > q_\Pi(p) \Rightarrow \mathrm{Core}(p, q) = \emptyset.$$

Similarly: $p > p_\Gamma(q) \Rightarrow \mathrm{Core}(p, q) = \emptyset$.

4. $q_\Pi$ is a decreasing function:

$$p < p' \Rightarrow q_\Pi(p') \le q_\Pi(p).$$

Similarly: $q < q' \Rightarrow p_\Gamma(q') \le p_\Gamma(q)$.

Proof: The proofs of all the listed properties are simple. Let us show
just the first part of property 4. For $p < p'$ we have

$$\Pi(p') = \{C : C \subseteq V \land p(C) \ge p'\} \subseteq \{C : C \subseteq V \land p(C) \ge p\} = \Pi(p).$$

This implies $G(p') \subseteq G(p)$ and therefore
$q_\Pi(p') \le q_\Pi(p)$. ■

From these properties it follows that for each $p \in P$ there exists a
maximum $q \in Q$: $q = q_\Pi(p)$, and for each $q \in Q$ there exists a
maximum $p \in P$: $p = p_\Gamma(q)$. This is a formal proof of the staircase
shape of the boundary. On this basis we can develop an algorithm for
determining the boundary of the set $(P, Q)$ in the $(p, q)$ coordinate
system; see Algorithm 1.


Algorithm 1 The algorithm to determine the boundary of the set $(P, Q)$ in
the $(p, q)$ coordinate system.

Input: $P = \{p_1, p_2, \ldots, p_r\}$, $p_i < p_{i+1}$, $Q = \{q_1, q_2, \ldots, q_s\}$, $q_i < q_{i+1}$.
Output: boundary set $boundary \subseteq P \times Q$.

Algorithm:

    q_max = 0
    boundary = ∅
    for p ∈ reverse(P) do
        q = q_Π(p)
        if q > q_max then
            q_max = q
            boundary = boundary ∪ {(p, q)}


In general Algorithm 1 is only of theoretical value because the sizes
of the sets $P$ and $Q$ can be very large. It can be used for some special
property functions, for example $f_1$, $f_2$, and $f_3$, where the sets $P$
and $Q$ are relatively small.
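
The loop of Algorithm 1 translates almost literally into Python. In the sketch below the function $q_\Pi$ is passed in as a callable, because computing it is the expensive part; the lookup table used in the example is a made-up decreasing function, not data from the paper.

```python
def boundary(P, q_pi):
    """Corner points of the staircase boundary, following Algorithm 1.

    P is an iterable of threshold values p; q_pi is a callable returning
    q_Pi(p). Both are supplied by the caller (assumption of this sketch).
    """
    q_max = 0
    corners = []
    for p in sorted(P, reverse=True):
        q = q_pi(p)
        if q > q_max:
            # A new, higher step of the staircase starts at this p.
            q_max = q
            corners.append((p, q))
    return corners


# Hypothetical decreasing function q_Pi given as a lookup table.
table = {1: 5, 2: 5, 3: 2, 4: 2, 5: 1}
corners = boundary(table.keys(), table.get)
```

Scanning $P$ from the largest value downward keeps, for every level of $q$, only the largest $p$ at which that level is reached.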


4 The algorithm 

We propose an algorithm for determining a generalized two-mode core in
two-mode networks for given thresholds $p$ and $q$ and property functions $f$
and $g$.

The basic idea of the algorithm for a generalized two-mode core is to
repeatedly remove the nodes that do not belong to it. Since the network is
two-mode, the removal condition depends on the subset to which a node belongs.

Algorithm 2 Basic algorithm for determining a generalized two-mode core.

    C = V
    while ∃v ∈ C : (v ∈ C_1 ∧ f(v, C) < p) ∨ (v ∈ C_2 ∧ g(v, C) < q) do
        C = C \ {v}


Adapting the proof of Theorem 1 from [8] to two-mode networks we can
prove the following theorem.

Theorem 4.1. Algorithm 2 determines the generalized two-mode core
at thresholds $(p, q)$ for all monotonic node property functions $f(v, C)$ and
$g(v, C)$. The result of the algorithm does not depend on the order of deletion
of the nodes that do not belong to the core, that is, the nodes that satisfy
the while loop condition.


Because the order of deletions has no impact on the final result we can, 
in our elaboration of the algorithm, start deleting from the first set, then 
switch to the second set, and back to the first set, etc., until no deletion is 
possible. 
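
The basic deletion procedure and its independence of the deletion order can be tried out with a short Python sketch. The six-node network, the function names, and the `pick` parameter (which chooses the next node to delete) are assumptions made for this illustration; the sketch recomputes all property values in every round and is therefore far less efficient than the heap-based elaboration.

```python
# Hypothetical two-mode network: journals j1..j3 and authors a1..a3.
adj = {
    "j1": {"a1", "a2"},
    "j2": {"a1", "a2", "a3"},
    "j3": {"a3"},
    "a1": {"j1", "j2"},
    "a2": {"j1", "j2"},
    "a3": {"j2", "j3"},
}
V1 = {"j1", "j2", "j3"}
V2 = {"a1", "a2", "a3"}


def deg(v, C):
    """Degree of v within C."""
    return len(adj[v] & C)


def basic_core(V1, V2, f, g, p, q, pick=min):
    """Delete one violating node at a time until none is left."""
    C = set(V1) | set(V2)
    while True:
        bad = [v for v in C
               if (v in V1 and f(v, C) < p) or (v in V2 and g(v, C) < q)]
        if not bad:
            return C
        C.remove(pick(bad))  # any choice of the next node gives the same result


core_22 = basic_core(V1, V2, deg, deg, 2, 2)
core_32_min = basic_core(V1, V2, deg, deg, 3, 2, pick=min)
core_32_max = basic_core(V1, V2, deg, deg, 3, 2, pick=max)
```

With both property functions equal to the degree this reproduces ordinary $(p, q)$-cores, and deleting in two different orders leads to the same set.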

We use a binary heap implementation of priority queues to organize
the nodes in such a way that we can efficiently get the node with the smallest
value of the property function as the root element of a heap. The function
RemoveRoot(heap) returns the root element and removes it from the heap.
The value of the property function is calculated for each node according to
the set of links $\mathcal{L}$ and the weight function $w$ as described in
Section 3.1.

We need two heaps, one for each subset of nodes. An element $E$ in a
heap is a pair $E = (key, value)$ consisting of the identifier of a node,
$E.key$, and its property function value, $E.value$. The elements are ordered
by the value of the property function. We know that all neighbors of every
node in the first subset are in the second subset, and vice versa.

To bound the work needed for updating we shall assume in the following
that both functions $f$ and $g$ are local. In this case we need to update only
the heap of the neighboring subset of nodes when deleting a node.

These decisions result in Algorithm 3 for determining the generalized
$(p, q)$-core for monotonic and local property functions $f$ and $g$.


4.1 Complexity 

Algorithm 3 could also be based on a simpler data structure such as queues.
We did not explore these options because they cannot be efficiently extended
to Algorithm 4.

Algorithm 3 The algorithm to determine the generalized two-mode core at
thresholds $(p, q)$ for monotonic and local node property functions $f$ and $g$.

Input: two-mode network $\mathcal{N} = ((V_1, V_2), \mathcal{L}, (f, g), w)$, $V_1 \cap V_2 = \emptyset$,
thresholds $p$, $q$.

Output: generalized two-mode core $C = C_1 \cup C_2$.

Procedure:

    # h is the property function of the other subset; it re-evaluates
    # the neighbors of a removed node.
    Remove(h, t, C_current, C_other, heap_current, heap_other):
        while not Empty(heap_current):
            if Root(heap_current).value ≥ t: return
            u = RemoveRoot(heap_current).key
            C_current = C_current \ {u}
            for v in N(u, C_other):
                update(heap_other, v, h(v, C))

Algorithm:

    C_1 = V_1, C_2 = V_2
    for v in V_1: value[v] = f(v, V)
    for v in V_2: value[v] = g(v, V)
    build(heap_1, value, C_1), build(heap_2, value, C_2)
    repeat:
        Remove(g, p, C_1, C_2, heap_1, heap_2)
        Remove(f, q, C_2, C_1, heap_2, heap_1)
    until no node was removed
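
A possible Python rendering of this heap-based procedure is sketched below. It is an illustration under assumptions, not the authors' implementation: Python's `heapq` module has no operation for changing the position of an element, so the sketch pushes a fresh entry on every update and skips outdated entries when they reach the root (lazy deletion). The network and all names are hypothetical.

```python
import heapq


def heap_core(V1, V2, adj, f, g, p, q):
    """Generalized two-mode core for local, monotonic f and g (sketch)."""
    C = set(V1) | set(V2)
    side = {v: 0 for v in V1}
    side.update({v: 1 for v in V2})
    prop, threshold = (f, g), (p, q)
    value = {v: prop[side[v]](v, C) for v in C}
    heaps = ([], [])
    for v in C:
        heapq.heappush(heaps[side[v]], (value[v], v))

    def remove(cur):
        """Delete all nodes of subset `cur` that are below their threshold."""
        removed = False
        heap = heaps[cur]
        while heap:
            val, u = heap[0]
            if u not in C or val != value[u]:
                heapq.heappop(heap)  # outdated entry, skip it
                continue
            if val >= threshold[cur]:
                break
            heapq.heappop(heap)
            C.remove(u)
            removed = True
            # Only neighbors, which lie in the other subset, change value.
            for v in adj[u]:
                if v in C:
                    value[v] = prop[1 - cur](v, C)
                    heapq.heappush(heaps[1 - cur], (value[v], v))
        return removed

    while True:
        removed_first = remove(0)
        removed_second = remove(1)
        if not (removed_first or removed_second):
            return C


# Hypothetical network; both property functions are the degree within C.
adj = {
    "j1": {"a1", "a2"}, "j2": {"a1", "a2", "a3"}, "j3": {"a3"},
    "a1": {"j1", "j2"}, "a2": {"j1", "j2"}, "a3": {"j2", "j3"},
}
V1, V2 = {"j1", "j2", "j3"}, {"a1", "a2", "a3"}


def deg(v, C):
    return len(adj[v] & C)


core_22 = heap_core(V1, V2, adj, deg, deg, 2, 2)
core_32 = heap_core(V1, V2, adj, deg, deg, 3, 2)
```

Lazy deletion keeps the code short at the price of extra heap entries; a heap with a true position-update operation, as assumed in the complexity analysis, avoids them.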

Assume that the time complexity of the calculation of a local property
function is $O(\deg(v))$ for each node $v$ in the network. Building the
heap takes $O(n \log n)$. We use binary heaps instead of faster heaps with
amortized constant update time because they are easier to implement.
Because $\sum_{v \in V} \deg(v) = 2m$, the time complexity of the
initialization is

$$O\Big(\sum_{v \in V} \deg(v) + n \log n\Big) = O(\max(m, n \log n)).$$

At each step of the while loop some node is removed and the property
function values of its neighbors change. The neighbors also change their
position in the heap according to the change of their value.

The update of the property function value may require less time than its
calculation. For example, the value of $f_1(v, C) = \deg_C(v)$ is corrected
simply by reducing it by one for every removed neighbor of the node $v$. The
update of the value of the property functions $f_1$, $f_2$, $f_3$, $f_4$,
$f_5$, $f_7$, $f_8$, $f_{11}$, and $f_{13}$ takes $O(1)$ time, but for the
property functions $f_6$, $f_9$, $f_{10}$, $f_{12}$, and $f_{14}$ it takes
$O(\deg(v))$ for every node $v$.

The heaps are implemented in such a way that changing the position
of an element takes $O(\log n)$ time.

Let $s = (v_1, v_2, \ldots, v_d)$ be the sequence of nodes removed during the
execution of the algorithm. The removal of a node $v_i$ takes $O(\log n)$. The
update of the property function value of each of its neighbors $v_j$ and the
change of its position in the heap take $O(\log n)$ or
$O(\max(\log n, \deg(v_j)))$, depending on the property function used. For the
sequence $s$ the updates cost

$$\sum_{v_i \in s} \deg(v_i) \cdot O(\log n) \le 2m \cdot O(\log n) = O(m \log n)$$

for the property functions $f_1$, $f_2$, $f_3$, $f_4$, $f_5$, $f_7$, $f_8$,
$f_{11}$, and $f_{13}$, or

$$\sum_{v_i \in s} \deg(v_i) \cdot O(\max(\log n, \deg(v_j))) \le O(\max(\log n, \Delta)) \cdot \sum_{v_i \in s} \deg(v_i) \le 2m \cdot O(\max(\log n, \Delta)) = O(m \cdot \max(\log n, \Delta)),$$

where $\Delta = \max_{v \in V} \deg(v)$, for the property functions $f_6$,
$f_9$, $f_{10}$, $f_{12}$, and $f_{14}$.

The time complexity of the initialization of the algorithm is lower than the
time complexity of the main loop of the algorithm. So the time complexity
of the whole algorithm for determining the generalized two-mode core is

$$O(m \log n)$$

for the property functions $f_1$, $f_2$, $f_3$, $f_4$, $f_5$, $f_7$, $f_8$,
$f_{11}$, and $f_{13}$, and

$$O(m \cdot \max(\log n, \Delta))$$

for the property functions $f_6$, $f_9$, $f_{10}$, $f_{12}$, and $f_{14}$.


4.2 An algorithm for one threshold value fixed 


We adapted the algorithm so that one subset of nodes has a fixed threshold
value. The result of this algorithm is a vector in which every node has as its
value the maximum value of the non-fixed threshold for which it is still in the
corresponding generalized two-mode core. This helps in selecting the thresholds
for which we get the most important generalized two-mode cores.

Algorithm 4 presents such an adapted version of Algorithm 3. In this
algorithm we fix the first threshold. In the case where the second threshold
is fixed we apply Lemma 3.1.


The algorithm is again based on the idea of deleting the nodes that do not
satisfy the threshold. In the elaboration of this algorithm we also use heaps.
The value of the property function is calculated for each node according to
the set of links $\mathcal{L}$ and the weight function $w$ as described in
Section 3.1. The nodes are ordered in heaps for both subsets of nodes by the
value of the property functions.

A step of the while loop starts by deleting all nodes in the heap for the
first subset of nodes that do not satisfy the fixed threshold. Removed nodes
get the value that the second threshold had in the previous step of the loop,
because this is the largest value of the second threshold for which the
removed node is still in the generalized two-mode core. Property function
values of some neighboring nodes might change, so after the first part of the
step we set the second threshold $q$ to the current smallest value in the heap
for the second subset of nodes. Then we remove all nodes from the second heap
whose property function value is equal to the current $q$. Removed nodes get
the value $q$ in the resulting vector. At the end of the step we set the
previous value of $q$ to its current value and its current value to the value
of the root of the second heap.

The complexity of Algorithm 4 is the same as the complexity of Algorithm 3.

Algorithm 4 The algorithm to determine the vector of generalized two-mode
core levels at fixed threshold $p$ for monotonic and local node property
functions $f$ and $g$.

Input: two-mode network $\mathcal{N} = ((V_1, V_2), \mathcal{L}, (f, g), w)$, $V_1 \cap V_2 = \emptyset$,
threshold $p$.

Output: vector $T$, $T[v]$ = max value of the threshold $q$ such
that $v \in \mathrm{Core}(p, q)$.

Procedures:

    # The function argument is the property function of the other subset;
    # it re-evaluates the neighbors of a removed node.
    RemoveFixed(g, p, q, C_1, C_2, heap_1, heap_2):
        while not Empty(heap_1):
            if Root(heap_1).value ≥ p: return
            u = RemoveRoot(heap_1).key
            T[u] = q
            C_1 = C_1 \ {u}
            for v in N(u, C_2):
                update(heap_2, v, g(v, C_1))

    RemoveChanging(f, q, C_2, C_1, heap_2, heap_1):
        while not Empty(heap_2):
            if Root(heap_2).value > q: return
            u = RemoveRoot(heap_2).key
            T[u] = q
            C_2 = C_2 \ {u}
            for v in N(u, C_1):
                update(heap_1, v, f(v, C_2))

Algorithm:

    C_1 = V_1, C_2 = V_2
    for v in V_1: value[v] = f(v, V_2), T[v] = -1
    for v in V_2: value[v] = g(v, V_1), T[v] = -1
    q = -1
    build(heap_1, value, V_1), build(heap_2, value, V_2)
    repeat:
        RemoveFixed(g, p, q, C_1, C_2, heap_1, heap_2)
        if not Empty(heap_2):
            q = Root(heap_2).value
            RemoveChanging(f, q, C_2, C_1, heap_2, heap_1)
    until Empty(heap_1) ∧ Empty(heap_2)


5 Applications

The possibility of using different node properties for the identification of
important two-mode subnetworks allows the analysts to gain information
about different types of groups with a single method. We are continuing to
search for node properties to include in our list and in the supporting
program, and to apply them to real-life data.

The method of generalized two-mode cores can be applied to different
real-life data. The input network for the method does not need to be a
two-mode network. The method can also be applied to a one-mode network
considered as a two-mode network. For example, in an authors' citation
network the 'row'-authors can be considered as users, and the 'column'-authors
as (knowledge) providers. The use of generalized two-mode cores on a
one-mode network allows the analyst to find a subnetwork that is characterized
by two different property functions.

5.1 Social networks 

Let us take a look at an example. Web of Science is a bibliographic database.
We used the data obtained in 2008 from this database for the query "social
network*", expanded with descriptions of the most frequent references and with
bibliographies of around 100 social networkers. We constructed some
two-mode networks. Two among them are the networks works × journals
and works × authors (193376 works, 14651 journals, and 75930 authors). We
multiply the transpose of the first network with the second network to get
the network journals × authors. A journal and an author are linked if the
author published at least one work in that journal. The weight of a link is
equal to the number of such works.
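
The multiplication of the transposed works × journals network with the works × authors network amounts to counting, for every journal and author, the works they share. A minimal Python sketch follows; the dictionary layout, the function name, and the three toy works are assumptions for illustration and do not come from the Web of Science data.

```python
from collections import Counter


def journals_by_authors(works_journals, works_authors):
    """Weight of (journal, author) = number of works linking the two.

    works_journals maps a work to the set of its journals and
    works_authors maps a work to the set of its authors (hypothetical layout).
    """
    weight = Counter()
    for work, journals in works_journals.items():
        for journal in journals:
            for author in works_authors.get(work, ()):
                weight[(journal, author)] += 1
    return weight


# Toy data: three works, two journals, two authors.
works_journals = {"w1": {"J1"}, "w2": {"J1"}, "w3": {"J2"}}
works_authors = {"w1": {"A", "B"}, "w2": {"A"}, "w3": {"B"}}
weights = journals_by_authors(works_journals, works_authors)
```

For large sparse data this loop touches each (work, journal, author) combination once, which is the same work a sparse matrix product would do.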

The simplest generalized two-mode cores are the ones with both property
functions the same. If we select

$$f_A(v, C) = g_A(v, C) = \deg_C(v)$$

we get the ordinary $(p, q)$-cores. A $(p, q)$-core contains the journals that
published works of at least $p$ different authors in this core and the authors
that published their works in at least $q$ different journals in this core.

We determined the generalized two-mode core for the functions $f_A$ and $g_A$
and selected the threshold values $p = 85$ and $q = 3$, for which we get the
smallest two-mode core. This is one of the generalized two-mode cores on the
border of the $(p, q)$ region. It determines a subnetwork of journals in which
at least 85 authors (in this subnetwork) published their works, and of authors
that published their works in at least 3 different journals (in this
subnetwork). There are 4 such journals and 128 such authors. The journals in
this two-mode core are American Sociological Review (with degree 122), American
Journal of Sociology (112), Social Forces (90), and Annual Review of Sociology
(85). Table 2 lists the authors that are linked to all 4 journals in this
two-mode core.


Table 2: Authors in the $\mathrm{Core}(85, 3; \deg_C(v), \deg_C(v))$ that are
linked to all journals in this two-mode core.

1 Breiger, R. | 10 Kandel, D. | 19 Olzak, S.
2 Burt, R. | 11 Keister, L. | 20 Portes, A.
3 DiMaggio, P. | 12 Knoke, D. | 21 Reskin, B.
4 Fischer, C. | 13 Lieberson, S. | 22 Ridgeway, C.
5 Friedkin, N. | 14 Lin, N. | 23 Sampson, R.
6 Galaskiewicz, J. | 15 Marsden, P. | 24 Skvoretz, J.
7 Glass, J. | 16 McPherson, J. | 25 Thoits, P.
8 Kalleberg, A. | 17 Mizruchi, M. |
9 Kalmijn, M. | 18 Nee, V. |

If we want to consider the number of works (the sum of weights of links)
we can use

$$f_B(v, C) = \sum_{u \in N(v, C)} w(v, u)$$

while $g_B(v, C) = \deg_C(v)$ stays the same. This generalized two-mode core
with thresholds $p$ and $q$ contains the journals that published at least $p$
works of authors in the core and the authors that published their works in at
least $q$ journals in the core.

We determined generalized two-mode cores for these two functions using
the algorithm for one fixed threshold. We selected $p \in \{0, 1, 2, 5, 10\}$
and determined the generalized two-mode cores for all five values of the fixed
parameter $p$ and the non-fixed parameter $q$. Fig. 2 presents the relation
between the value of the parameter $q$ and the size of the generalized
two-mode core. One can see that the size of the generalized two-mode core
differs from one pair of values $(p, q)$ to another. Because the coordinate
axes are in logarithmic scale, the point for $\mathrm{Core}(0, 0; f_B, g_B)$ is
not shown. Its size is equal to the size of the set of all nodes. Fig. 3 shows
the $\mathrm{Core}(10, 12; f_B, g_B)$ for the two selected property functions.
This is the smallest generalized two-mode core for $p = 10$. The
$\mathrm{Core}(10, 12; f_B, g_B)$ includes all journals in which authors in the
core published at least 10 works in total, and all authors who each published
their works in at least 12 different journals in the core. The thickness of a
link represents the number of works an author published in a journal: a
thicker link means more works. One can notice that the journal data were not
cleaned, because the identification problem appears in Fig. 3: the following
pairs of journal identifiers represent the same journal:

• amer sociol rev, am sociol rev: American Sociological Review,

• amer j sociol, am j sociol: American Journal of Sociology,

• adm sci q, admin sci quart: Administrative Science Quarterly.

The identifier sociol method represents one of the two journals Sociological
Methodology and Sociological Methods & Research. These two journals are
also present in Fig. 3 with the identifiers sociol methodol and sociological
methods, respectively.


[Plot: horizontal axis $q$ in logarithmic scale; series Core(0, q), Core(1, q), Core(2, q), Core(5, q), and Core(10, q).]

Figure 2: A relation between the parameter $q$ and the size of the generalized
two-mode core for the fixed values of parameter $p$ and the property functions
$f_B$ and $g_B$.


We could also select more complex property functions:

$$f_C(v, C) = \max_{u \in N(v, C)} \deg(u) - \min_{u \in N(v, C)} \deg(u)$$

and

$$g_C(v, C) = \frac{\sum_{u \in N(v, C)} w(v, u)}{\sum_{u \in N(v)} w(v, u)}.$$

The $\mathrm{Core}(p, q; f_C, g_C)$ contains the journals that published
approximately (for a small value of $p$) the same number of works of each
author in the core that is linked to those journals. It also contains the
authors that published at least the proportion $q$ of their works in journals
that are in this core. This is one possible way to search for journals and
authors that are tightly connected.

[Network drawing. Node labels: BRIT J MAT, AMER SOCIOL REV, KNOWLEDGE, J ROY STATIST SOC SER A STAT, AMER J SOCIOL, SOCIOL METHOD, POETICS, J CLASSIF, QUAL QUANT, SOCIOLOGICAL METHODS, REV FR SOCIOL, CONTEMP SOCIOL, SOC SCI RES, SOCIOL METHOD RES, RATION SOC, SOCIOL METHODOL, PSYCHOMETRIKA, GALASKIE_J, CARLEY_K.]

Figure 3: A generalized two-mode core for $p = 10$ and $q = 12$ and for the
property functions $f_B$ and $g_B$.


We determined the border of the $(p, q)$ region for generalized two-mode cores
for the property functions $f_C$ and $g_C$. The border is displayed in Fig. 4.
At each boundary corner a pair of sizes of both sets of nodes in
the corresponding generalized two-mode core is written. For example, the
$\mathrm{Core}(70, 0.25; f_C, g_C)$ has 26 nodes in the first set and 118 nodes
in the second set.

Figure 4: The border of the $(p, q)$ region for $f_C$ and $g_C$.


6 Conclusion

In the paper we propose a new direct method for the analysis of two-mode
networks. We provide an algorithm for determining generalized two-mode
cores and present some examples of its application to real-life data. For
efficiency on large sparse networks we exploit the sparsity. The presented
approach can be straightforwardly extended to $r$-mode networks for $r > 2$.

In our future work we intend to improve the efficiency of the algorithm
and extend it to weights measured in nominal scales. We plan
to make an experimental complexity analysis on random two-mode networks.
For this task we also need to implement the generation of random two-mode
networks of different types.

We have already elaborated Algorithm 3 further to produce the nested
generalized two-mode cores for one fixed threshold, as shown in Algorithm 4.
We would like to improve this algorithm further: to produce all $(p, q)$ pairs
that determine different generalized two-mode cores and to identify only the
boundary $(p, q)$ pairs as they are shown in Fig. 1. We also intend to explore
the structure of the space of all generalized two-mode cores.

An implementation of the proposed algorithms in Python is available at 
http://zvonka.fmf.uni-lj.si/netbook/doku.php?id=pub:core. 